Processing Spontaneous Orthography

Authors

  • Ramy Eskander
  • Nizar Habash
  • Owen Rambow
  • Nadi Tomeh
Abstract

In cases in which there is no standard orthography for a language or language variant, written texts will display a variety of orthographic choices. This is problematic for natural language processing (NLP) because it creates spurious data sparseness. We study the transformation of spontaneously spelled Egyptian Arabic into a conventionalized orthography which we have previously proposed for NLP purposes. We show that a two-stage process can reduce divergences from this standard by 69%, making subsequent processing of Egyptian Arabic easier.
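The sparseness problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' two-stage system: the tokens are invented placeholder strings, and the `CONVENTIONAL` lookup table and `normalize` function are hypothetical names introduced only for this illustration. The point is that several spellings of one word split its count across several vocabulary entries, and mapping them to one conventional form merges those counts.

```python
from collections import Counter

# Toy tokens: invented placeholder spellings, not real corpus data.
# "wordA1".."wordA3" stand for three spellings of one word, and so on.
tokens = ["wordA1", "wordA2", "wordA1", "wordA3", "wordB1", "wordB2"]

# Hypothetical lookup from each spontaneous variant to one conventional form.
CONVENTIONAL = {
    "wordA1": "wordA", "wordA2": "wordA", "wordA3": "wordA",
    "wordB1": "wordB", "wordB2": "wordB",
}

def normalize(token):
    """Return the conventional form, or the token itself if it is unknown."""
    return CONVENTIONAL.get(token, token)

raw_counts = Counter(tokens)
norm_counts = Counter(normalize(t) for t in tokens)

print(len(raw_counts), "distinct types before normalization")
print(len(norm_counts), "distinct types after normalization")
```

A real system would have to predict the conventional form for unseen spellings rather than look it up, which is where the difficulty lies.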

Similar resources

Implicit Priming Effects in Chinese Word Recall: the Role of Orthography and Tones in the Mental Lexicon

This paper explores the relative contributions made by orthography, syllabic segment, and lexical tone in the word recognition and retrieval process. It also challenges recent assumptions regarding the role of orthography and tones in mental lexicon architecture. Using an implicit priming paradigm, a word recognition experiment was conducted with native speakers of two tonal languages, Chinese ...

Paying attention to orthography: a visual evoked potential study

In adult readers, letters, and words are rapidly identified within visual networks to allow for efficient reading abilities. Neuroimaging studies of orthography have mostly used words and letter strings that recruit many hierarchical levels in reading. Understanding how single letters are processed could provide further insight into orthographic processing. The present study investigated orthog...

The Effect of L1 Persian on the Acquisition of English L2 Orthographic System on the Shared Grounds

This paper elaborates on Persian and English orthographic shared aspects to study the effects of L1 Persian on learning English as a foreign language. While there are some examples of letter and sound mismatches in the orthographic system of both languages, those of English are more complex than Persian. In order to see the effect of the mismatch between orthography and transcription, 40 Persia...

Metalinguistic awareness and reading performance: a cross language comparison.

The study examined two questions: (1) do the greater phonological awareness skills of bilinguals affect reading performance; (2) to what extent do the orthographic characteristics of a language influence reading performance and how does this interact with the effects of phonological awareness. We estimated phonological metalinguistic abilities and reading measures in three groups of first grad...

Learning to spell in a language with transparent orthography: Distributional properties of orthography and whole-word lexical processing.

We examined how whole-word lexical information and knowledge of distributional properties of orthography interact in children's spelling. High- versus low-frequency words, which included inconsistently spelled segments occurring more or less frequently in the orthography, were used in two experiments: (a) word spelling; (b) lexical priming of pseudoword spelling. Participants were 1st-, 2nd-, a...

Publication date: 2013